報酬・意思決定
Reward and Decision Making
P3-2-49
病的賭博患者における報酬予測時の島皮質の活動は疾患の重症度と逆相関していた
Decreased insula activation during reward anticipation in pathological gambling negatively correlated with severity of illness

○鶴身孝介1, 川田良作1, 横山如人1, 村井俊哉1, 高橋英彦1
○Kosuke Tsurumi1, Ryosaku Kawada1, Naoto Yokoyama1, Toshiya Murai1, Hidehiko Takahashi1
京都大学大学院医学研究科脳病態生理学講座(精神医学)1
Kyoto University Department of Psychiatry, Graduate School of Medicine1

Introduction: Pathological gambling (PG) is a chronic mental disorder in which patients cannot stop gambling despite negative consequences. Accumulating evidence suggests that PG has many similarities with substance use disorders. However, whereas the progression of substance use disorders is known to be affected by drugs of abuse, how PG develops remains unclear. Methods: Twenty PG patients and 20 age- and gender-matched healthy controls (HC) were studied. Using fMRI, brain activation during reward anticipation was measured with a monetary incentive delay task. Results: During reward anticipation, PG patients showed decreased activity compared with HC in a broad range of the reward system, including the ventral striatum, cingulate cortex, and insula. In PG participants, activation in the insula was negatively correlated with severity of illness. Conclusion: The more severe the illness, the less activation PG patients showed in the insula during reward anticipation. Our findings suggest that insular activation during reward anticipation may serve as a marker of the severity of PG.
P3-2-50
大脳基底核側坐核コアとシェルによる相補的な行動抑制
The nucleus accumbens core and shell complementarily inhibit redundant actions in different manners

○佐藤千佳1, 古舘宏之1, 小林哲也1, 関口達彦2, 木村哲也3, 庄野修2
○Chika Sato1, Hiroyuki Furudate1, Tetsuya Kobayashi1, Tatsuhiko Sekiguchi2, Tetsuya Kimura3, Osamu Shouno2
埼玉大院・理工・生命科学1, (株)ホンダ・リサーチ・インスティチュート・ジャパン2, 国立長寿医療センター・認知症先進医療開発センター3
Div. of Life Science., Grad. Sch. of Science and Engineering, Saitama Univ., Saitama, Japan1, Honda Research Institute Japan Co., Ltd., Wako, Saitama, Japan2, Department of Aging Neurobiology, National Center for Geriatrics and Gerontology, Obu-shi, Aichi, Japan3

We introduced a behavioral paradigm, an 8-arm food-foraging task, to examine how animals control their activity level in relation to changes in reward availability. In the task, well-trained rats reduced their activity level after acquiring the last reward and before their first visit to an empty arm (re-visit error), suggesting that the rats had learned a way to anticipate the timing and/or location of the last reward acquisition in each trial. The accumbens core (AcbC) or shell (AcbS) was lesioned after the rats had established this activity reduction following the last reward acquisition. Neither lesion affected any behavioral parameter before the last reward acquisition. AcbC-lesioned rats showed a significantly higher locomotion speed just after the last reward acquisition compared with controls. This elevated activity gradually decreased as the total number of re-visit errors increased, as in controls, suggesting a preserved ability for retrospective control of activity based on their own behavioral results. Thus, the AcbC appears to be required for rats to reduce erroneous arm visits just after the last reward acquisition on the basis of prospection about reward availability. AcbS-lesioned rats tended to show higher activity 2-10 min after acquiring all rewards, but not during the first 2 min. Thus, the AcbS would not be involved in the prospective reduction of activity based on prior learning. Unlike in controls, the total number of re-visit errors did not influence activity levels after the last reward acquisition in AcbS-lesioned rats. Thus, the AcbS contributes to retrospective, real-time control of activity based on behavioral results. The present study demonstrates an important functional differentiation between these Acb sub-regions in the control of activity level associated with reward availability.
P3-2-51
確率情報及び意思決定の履歴が知覚的意思決定に作用する神経機構
Neural mechanisms underlying the effect of probabilistic information and decision history on perceptual decision making

○金子宜之1,2, 梅田和昌1, 坂井克之1
○Yoshiyuki Kaneko1,2, Kazumasa Umeda1, Katsuyuki Sakai1
東京大院・医・認知言語神経科学1, 日本大学医学部精神医学系精神医学分野2
Dept Cogn Neurosci, Univ of Tokyo, Tokyo1, Dept Psychiatry, Nihon Univ, Tokyo2

Perceptual decision is known to be biased not only by a cue indicating the probability of a target stimulus, but also by decisions made in the past. Here, using functional magnetic resonance imaging, we show that the effects of probabilistic information and decision history on perceptual decision making are subserved by different neural mechanisms. Normal human subjects made decisions about the presence or absence of downward motion of sine-wave gratings (target) in a noisy background. The probability of target presence was indicated by a cue presented before the stimulus on each trial. Subjects tended to repeat the same decision as in the previous trial, especially when the target was present. Using signal detection theory, we found that the previous decision predominantly influenced perceptual sensitivity, whereas the probability cue predominantly influenced the decision criterion. During presentation of the sine-wave gratings, activation in the left superior temporal sulcus (STS) was higher when the target was present than when it was absent. When the target was present on the current trial, the activation was additionally modulated by the decision on the previous trial. When the target was absent, on the other hand, the activation was modulated by the type of probability cue on that trial. We also found that STS activation was significantly higher on trials after subjects reported the presence of the target than after they reported its absence, but that the activation was significantly higher on trials with a low-probability cue than on trials with a high-probability cue. The results suggest that within the STS, information about the previous decision and the probability cue interacts with current target information in different manners. The results may also suggest that probabilistic information modulates the starting point of the evidence accumulation process, whereas decision history modulates the processing of the target stimulus.
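The signal detection analysis described above separates sensitivity (d') from the decision criterion (c) using z-transformed hit and false-alarm rates. A minimal sketch in Python — the rates below are hypothetical, chosen only to illustrate a criterion shift with roughly constant sensitivity, not the study's data:

```python
from statistics import NormalDist

def sdt_measures(hit_rate, fa_rate):
    """Sensitivity d' and criterion c from hit and false-alarm rates."""
    z = NormalDist().inv_cdf   # inverse of the standard normal CDF
    d_prime = z(hit_rate) - z(fa_rate)
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))
    return d_prime, criterion

# Hypothetical rates: a high-probability cue yields more "present" reports
# (a more liberal criterion) with little change in sensitivity.
d_hi, c_hi = sdt_measures(hit_rate=0.90, fa_rate=0.30)
d_lo, c_lo = sdt_measures(hit_rate=0.75, fa_rate=0.15)
print(f"high-prob cue: d' = {d_hi:.2f}, c = {c_hi:.2f}")
print(f"low-prob cue:  d' = {d_lo:.2f}, c = {c_lo:.2f}")
```

A history effect on sensitivity would instead show up as a change in d' at a roughly constant c.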
P3-2-52
外側手綱核と前部帯状皮質における学習シグナルの表現
Representation of learning signals in lateral habenula and anterior cingulate cortex

○川合隆嗣1,2,5, 佐藤暢哉2, 高田昌彦1, 松本正幸1,3,4
○Takashi Kawai1,2,5, Nobuya Sato2, Masahiko Takada1, Masayuki Matsumoto1,3,4
京都大・霊長研・統合脳システム1, 関学大院・文・総合心理科学2, 筑波大院・人間総合3, 筑波大・医・生命医科学4, 日本学術振興会特別研究員DC5
Div Sys Neurosci, Pri Res Inst, Kyoto Univ, Inuyama1, Grad Sch Hum, Kwansei Gakuin Univ, Nishinomiya2, Grad Sch Compreh Hum Sci, Univ Tsukuba, Tsukuba3, Div Biomed Sci, Facult Med, Univ Tsukuba, Tsukuba4, JSPS Research Fellow, Tokyo, Japan5

The lateral habenula (LHb) and the anterior cingulate cortex (ACC) play crucial roles in monitoring negative feedback. To investigate whether and how these structures cooperate in the learning process, we compared their neuronal activities in a monkey performing a reversal learning task. While the monkey was gazing at a fixation point, two saccade targets were presented on the left and right sides of the fixation point, and the monkey was required to choose one of them. A saccade in one direction was followed by reward with 50% probability, while a saccade in the other direction was not. After 20 to 30 trials, the rewarded direction was reversed without any instruction. The monkey learned to choose the rewarded direction by trial and error, and adaptively changed his choice based on the past reward history: the monkey switched direction with high probability when choices of the same direction were repeatedly unrewarded. We recorded the activity of 38 LHb and 159 ACC neurons. Of these, 37 LHb and 91 ACC neurons showed a significant response to the feedback (reward or no-reward), and a majority (LHb, 37/37; ACC, 54/91) was more strongly activated by no-reward. We found that the no-reward activation was influenced by past reward history in the ACC, but not in the LHb. The no-reward activation in the ACC was enhanced as no-reward trials were repeated, suggesting a parallel between the ACC signals and the monkey's choice behavior. The no-reward activation started later in the ACC than in the LHb. Furthermore, many neurons in the LHb and ACC responded to the fixation point. Only in ACC neurons was this response affected by the forthcoming decision to choose the left or right target. Notably, this "pre-choice" response was enhanced in trials in which the monkey updated his choice from the previous trial. Our findings suggest that ACC neurons signal past information relevant to learning.
P3-2-53
行動決定課題遂行における報酬価値情報処理に関連したアカゲザル眼窩前頭皮質のニューロン活動
Single neuronal activity in the monkey orbitofrontal cortex related to reward value processing during the decision-making schedule task

○瀬戸川剛1,2, 水挽貴至1,3, 稲葉清規1,2, 秋澤文香1, 松本有央4, 設楽宗孝1,3
○Tsuyoshi Setogawa1,2, Takashi Mizuhiki1,3, Kiyonori Inaba1,2, Fumika Akizawa1, Narihisa Matsumoto4, Munetaka Shidara1,3
筑波大院・人間総合科学1, 日本学術振興会2, 筑波大学・医学医療系3, 独立行政法人産業技術総合研究所ヒューマンライフテクノロジー研究部門4
Grad. Sch. of Comprehensive Human Sci., Univ. of Tsukuba, Tsukuba, Ibaraki, Japan1, JSPS Res. Fellow2, Faculty of Medicine, University of Tsukuba, Tsukuba, Japan3, Human Tech. Res. Inst., AIST, Tsukuba, Japan4

In our daily life, we often choose one item or action from several alternatives by considering their values and the effort needed to obtain them. To investigate the mechanism of such a decision-making process, we developed a decision-making schedule task and recorded single neuronal activity from the monkey orbitofrontal cortex (OFC), which has been reported to be one of the important brain areas for reward-guided behaviors. The monkey was initially trained to perform a reward schedule task, in which it had to complete a schedule composed of 1, 2 or 4 visual-discrimination trials to earn 1, 2 or 4 drops of liquid reward. After the monkey learned this task, the decision-making schedule task was introduced. This task consisted of a decision-making part and a reward schedule part. In the decision-making part, two choice targets were presented sequentially at the center of the computer monitor (referred to as the first and second targets, respectively). The brightness and length of each choice target were proportional to the amount of liquid reward (1, 2 or 4 drops) and the required number of visual-discrimination trials (1, 2 or 4), respectively. After the sequential presentation, the two targets simultaneously reappeared on both sides of the fixation point in random order. The monkey was then required to choose one of the two targets by touching the corresponding bar in the chair. Following the choice, the chosen reward schedule task was started. We recorded single neuronal activity in the monkey's OFC during the decision-making schedule task. Over 60% of the recorded neurons increased their firing during the first-target presentation period. Some neuronal activity was modulated by the difference in value between the two choice targets. This result suggests that OFC neurons play an important role in decision making through reward value information processing.
P3-2-54
病的賭博における「深追い」;「埋没費用効果」を用いたfMRI研究
IMAGING PATHOLOGICAL GAMBLERS' “CHASING”; fMRI STUDY USING “SUNK COST EFFECT”

○川田良作1, 藤本心佑1, 鶴身孝介1, 横山如人1, 村井俊哉1, 高橋英彦1
○Ryosaku Kawada1, Shinsuke Fujimoto1, Kosuke Tsurumi1, Naoto Yokoyama1, Toshiya Murai1, Hidehiko Takahashi1
京都大学大学院医学研究科脳病態生理学講座精神医学教室1
Department of Neuropsychiatry, Graduate School of Medicine, Kyoto University, Kyoto, Japan1

Background: Pathological gambling (PG) has clinical features and neural underpinnings that overlap with substance dependence, and these disorders are planned to be categorized together as "substance use and addictive disorders" in the upcoming DSM-5. In DSM-IV, the only criterion that differentiates PG from substance dependence is "chasing": after losing money gambling, often returns another day to get even. We interpreted this phenomenon as a "sunk cost effect", a concept from behavioral economics: a greater tendency to continue an endeavor once an investment in money, effort, or time has been made. Method: 21 PG patients and 25 gender-matched healthy controls (HC) underwent functional magnetic resonance imaging to evaluate brain activity when decision making is influenced by sunk cost. Results: PG showed no significant behavioral difference in the task. When making decisions under the sunk cost effect, PG recruited the left temporo-parietal junction (TPJ), while HC recruited the left medial prefrontal cortex (MPFC). Discussion: Counterfactual thinking or mentalizing may underlie decision making under the sunk cost effect. During this process, HC may mentalize about long-term goals by activating the MPFC, while PG mentalize about more immediate (not long-term) goals or intentions by recruiting the TPJ.
P3-2-55
報酬学習および行動柔軟性における大脳基底核回路特異的役割
Pathway-specific, dopamine receptor subtype-specific control of nucleus accumbens in reward learning and its flexibility

○矢和多智1, 山口隆司1, 檀上輝子1, 疋田貴俊1,2, 中西重忠1
○Satoshi Yawata1, Takashi Yamaguchi1, Teruko Danjo1, Takatoshi Hikida1,2, Shigetada Nakanishi1
大阪バイオサイエンス研究所1, 京都大学・医・メディカルイノベーションセンター2
Department of Systems Biology, Osaka Bioscience Institute, Suita1, Medical Innovation Center, Kyoto Univ. Grad. School of Med2

In the basal ganglia, inputs from the nucleus accumbens (NAc) are transmitted through the direct and indirect parallel pathways and control reward-based learning. In the NAc, dopamine (DA) serves as a key neuromodulator in both pathways. This study aimed to explore how reward-based learning and its flexibility are controlled in a pathway-specific and DA receptor-dependent manner. To address the pathway-specific control of reward learning, we used reversible neurotransmission blocking (RNB), in which transmission of the direct (D-RNB) or the indirect pathway (I-RNB) in the NAc of both hemispheres was selectively blocked by transmission-blocking tetanus toxin. To address pathway-specific DA receptor function, we used an asymmetric RNB technique, in which transmission of the direct (D-aRNB) or the indirect pathway (I-aRNB) was unilaterally blocked and the other, intact side of the NAc was pharmacologically manipulated by injection of DA receptor agonists or antagonists. Reward-based learning was assessed by measuring goal-directed learning ability in a visual cue task (VCT) or a response direction task (RDT) in the plus maze. Learning flexibility was then tested by switching from a previously learned VCT to a new VCT or RDT. The D-RNB mice and D1 receptor antagonist-treated D-aRNB mice showed severe and comparable impairments in learning acquisition but normal flexibility in switching from the previously learned strategy. In contrast, the I-RNB mice and D2 receptor agonist-treated I-aRNB mice showed normal learning acquisition but severe impairments not only in the flexibility to switch learning but also in the subsequent acquisition of a new strategy. D1 and D2 receptors thus play distinct but cooperative roles in reward learning and its flexibility in a pathway-specific manner.
P3-2-56
報酬駆動型 STDP による意思決定の学習モデル
A Learning model of decision making with reward-driven STDP

○根来哲也1, Matthieu Gilson2,3, 深井朋樹1,3
○Tetsuya Negoro1, Matthieu Gilson2,3, Tomoki Fukai1,3
東京大学 新領域創成科学研究科 複雑理工学専攻1, 理化学研究所脳科学総合研究センター2, CREST3
Dept. of Complexity Science and Engineering, Grad. School of Frontier Science, The Univ. of Tokyo, Tokyo, Japan1, RIKEN Brain Science Institute2, CREST3

Reward signals play an important role in behavioural learning. From a cell-biological viewpoint, it has been suggested that reward signals act as neuromodulators that affect the spike-timing-dependent plasticity (STDP) window. Nevertheless, it has not been made clear how modulation of STDP affects the learning of decision making, nor whether reward-modulated STDP can explain stochastic phenomena in behavioural learning, such as the matching law. We therefore constructed and investigated a learning model of decision making whose learning rule obeys STDP modulated by reward signals. As corresponding work, Izhikevich proposed a learning rule for a reward-modulated STDP model (Izhikevich, 2007). We use this learning rule and simulate an instrumental conditioning experiment on a 2-input, 2-output feed-forward neural network. In Izhikevich's model, additive STDP with weight boundaries and no weight dependence is used; we instead use the log-STDP model, whose STDP window has weight dependence (Gilson and Fukai, 2011). The log-STDP model generates a distribution more similar to the weight distribution of real neurons and is thought to be more biologically plausible. We also estimated the effect of the triplet STDP model (Pfister and Gerstner, 2006) on decision making.
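The reward-modulated STDP rule referenced above (Izhikevich, 2007) couples an exponentially decaying eligibility trace, tagged by spike pairings, with a delayed dopamine signal; the weight only changes when both are nonzero. A simplified single-synapse sketch — the time constants, reward magnitude, and learning-rate constant are illustrative assumptions, not values from the cited models:

```python
import math

# Illustrative constants (assumed, not from the cited papers)
TAU_C, TAU_D = 1000.0, 200.0   # eligibility-trace / dopamine decay (ms)
A_PLUS, A_MINUS, TAU_STDP = 0.1, 0.12, 20.0
ETA, DT = 0.01, 1.0            # learning rate, time step (ms)

def stdp_window(lag):
    # lag = t_post - t_pre (ms); positive lag (pre before post) potentiates
    if lag > 0:
        return A_PLUS * math.exp(-lag / TAU_STDP)
    return -A_MINUS * math.exp(lag / TAU_STDP)

def simulate(reward_time, pairings, t_max=3000):
    """dc/dt = -c/tau_c + STDP at pairings; dd/dt = -d/tau_d + reward;
    dw/dt = eta * c * d  (Izhikevich-style eligibility-trace rule)."""
    c, d, w = 0.0, 0.0, 0.5
    pairings = dict(pairings)          # time (ms) -> spike lag (ms)
    for step in range(int(t_max / DT)):
        t = step * DT
        c -= (c / TAU_C) * DT
        d -= (d / TAU_D) * DT
        if t in pairings:
            c += stdp_window(pairings[t])   # pairing tags the synapse
        if t == reward_time:
            d += 0.5                        # phasic dopamine burst
        w += ETA * c * d * DT
    return w

# Pre-before-post pairings (lag +10 ms), then a reward 150 ms after the last one
pairs = [(t, 10.0) for t in range(1000, 1400, 50)]
w_rewarded = simulate(reward_time=1500, pairings=pairs)
w_unrewarded = simulate(reward_time=-1, pairings=pairs)  # no dopamine at all
print(f"w with reward: {w_rewarded:.3f}, without: {w_unrewarded:.3f}")
```

Because the eligibility trace outlives the pairings, the delayed reward still converts them into a weight increase, while the same pairings without reward leave the weight untouched.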
P3-2-57
自由選択課題時における線条体の階層的行動表現
Hierarchical coding in the striatum during a free-choice task

○伊藤真1, 銅谷賢治1
○Makoto Ito1, Kenji Doya1
沖縄科学技術大学院大学 神経計算ユニット1
Neural Computation Unit, OIST, Okinawa, Japan1

The striatum is a major input site of the basal ganglia that plays an essential role in decision making. Recent imaging and lesion studies have suggested that the subareas of the striatum have distinct roles. We recorded neuronal activities from the dorsolateral striatum (DLS), the dorsomedial striatum (DMS), and the ventral striatum (VS) of rats during a free-choice task, in which a rat was required to perform a nose-poke to either the left or right hole after the offset of a cue tone. Our previous analysis found that striatal neurons in all subareas had diverse activity patterns (multiple activity peaks for different task events), modulated by various kinds of information related to decision making, such as selected actions, reward outcomes, and action values. However, it is still unclear what the activity patterns of individual neurons encode. In this study, we proposed the hypothesis that the population activity of striatal neurons encodes which task phase a rat is currently in. To test this, we decomposed a trial into seven task phases, such as the duration of cue-tone presentation and the duration from the offset of the cue tone to the left or right nose-poke. We then tried to predict task phases from the population activity of simultaneously recorded neurons in each subarea using a multiclass logistic regression model. The average prediction accuracy from two neurons was slightly above the chance level (1/7 = 0.14) for all subareas (when N = 2: DLS 0.20, DMS 0.22, VS 0.19). However, the prediction accuracy from DMS neurons, but not DLS or VS neurons, dramatically improved when the number of neurons was increased (when N = 5: DLS 0.29, DMS 0.49, VS 0.25). These results suggest that while higher-level information related to decision making is represented in the entire striatum, finer-scaled information related to task phases is simultaneously represented in the DMS.
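The task-phase decoding described above can be illustrated with a small multiclass (softmax) logistic regression on synthetic spike counts. Everything below — the tuning curves, trial counts, and learning rate — is a hypothetical stand-in for the real recordings, meant only to show how population activity can predict a seven-way phase label above chance:

```python
import math
import random

random.seed(0)
N_PHASES, N_NEURONS = 7, 5

# Hypothetical tuning: each neuron fires most in phases near its preferred phase.
means = [[2.0 + 6.0 * math.exp(-(((p - 1.5 * n) % N_PHASES) ** 2) / 2.0)
          for n in range(N_NEURONS)] for p in range(N_PHASES)]

def poisson(lam):
    # Knuth's algorithm for Poisson-distributed spike counts
    threshold, k, prod = math.exp(-lam), 0, random.random()
    while prod > threshold:
        k += 1
        prod *= random.random()
    return k

# Simulated trials: (phase label, spike-count vector)
data = [(p, [poisson(means[p][n]) for n in range(N_NEURONS)])
        for _ in range(60) for p in range(N_PHASES)]

# Multinomial (softmax) logistic regression, batch gradient descent
W = [[0.0] * (N_NEURONS + 1) for _ in range(N_PHASES)]  # last entry = bias

def predict_probs(x):
    scores = [w[-1] + sum(wi * xi for wi, xi in zip(w, x)) for w in W]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

for _ in range(200):
    grad = [[0.0] * (N_NEURONS + 1) for _ in range(N_PHASES)]
    for label, x in data:
        probs = predict_probs(x)
        for c in range(N_PHASES):
            err = probs[c] - (1.0 if c == label else 0.0)
            for j in range(N_NEURONS):
                grad[c][j] += err * x[j]
            grad[c][-1] += err
    for c in range(N_PHASES):
        for j in range(N_NEURONS + 1):
            W[c][j] -= 0.01 * grad[c][j] / len(data)

correct = 0
for label, x in data:
    probs = predict_probs(x)
    correct += probs.index(max(probs)) == label
accuracy = correct / len(data)
print(f"training accuracy: {accuracy:.2f} (chance = {1 / N_PHASES:.2f})")
```

Adding neurons sharpens the class-conditional activity patterns, which is the intuition behind the accuracy gain with larger N reported for the DMS.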
P3-2-58
意識に昇らない視覚刺激による行動の最適化過程
Process of acquiring optimal behaviors by visual cue stimulus without awareness

○加藤利佳子1, 高桑徳宏1,2, Abdelhafid Zeghbib3, Peter Redgrave3, 伊佐正1,2
○Rikako Kato1, Norihiro Takakuwa1,2, Abdelhafid Zeghbib3, Peter Redgrave3, Tadashi Isa1,2
生理学研究所 認知行動発達機構研究部門1, 総合研究大学院大学2, シェフィールド大学3
Dept Dev. Physiol., Nat Inst. Physiol. Sci., Okazaki1, The Graduate Univ for Advanced Studies, Hayama2, Dept Psychol, Univ of Sheffield, Sheffield, United Kingdom3

Decision-making based on value judgments determines the extent to which particular behaviors are optimal for survival. However, in the face of changing environments we continually have to learn new optimal behaviors. Instrumental conditioning is a form of reinforcement learning in which new behaviors are acquired on the basis of how valuable their outcomes are. Studying whether subjective visual awareness is essential for this type of reinforcement learning is of great interest for understanding adaptive behaviors in daily life. In this study, we used monkeys with a unilateral primary visual cortex (V1) lesion as an animal model of 'blindsight', a phenomenon in which patients with damage to V1 retain an ability to localize visual stimuli by eye movements, but without visual awareness. We analyzed the process of reinforcement learning by presenting a reinforcing visual cue to the monkey's affected visual field. We designed a search task in which the monkey had to use eye movements to search for a hidden area on a blank screen. A secondary reinforcing visual stimulus was used as a cue to inform the subjects that they had moved their eyes into the hidden ('hot') area (HA). Each trial was initiated after the monkeys fixated at a randomly indicated position in the search space. Significant reductions in search time were found, even when cue stimuli were presented in the affected visual field. Incidental capture of the HA elicited cue presentation and delivery of reward. After that, the monkeys tended to repeat the behavior performed immediately before the cue presentation, and their eye positions tended to stay near the start position of the last saccade before cue presentation. After trial and error, saccades from that area converged on movements with the optimal direction and amplitude. These results show that a visual stimulus in the affected field can serve as a cue for judging each behavior and effectively reinforce novel behavior.
P3-2-59
心理物理と不確実性下の意思決定におけるアノーマリ
Psychophysics and the anomaly in decision under risk

○韓若康1
○Ruokang Han1
北海道大学・文学研究科・人間システム講座1
Dept Behav Sci, Univ of Hokkaido, Hokkaido1

People generally discount or devalue delayed and uncertain rewards. One proposition is that decisions over time and under risk share the same underlying mechanism, and thus both, in behavioral terms, are better described by hyperbolic discounting. In neoclassical economic theories, hyperbolic discounting indicates deviations from the normative models and is thus considered an anomaly (time inconsistency and the certainty effect, respectively). Takahashi (2005) proposed a time-based account of hyperbolic time discounting by applying nonlinear psychophysical time to the time discount model. However, it is still unknown whether this time-based account (psychophysical time) can also explain hyperbolic probability discounting (i.e., the certainty effect). This study aims to examine whether introducing psychophysical time can reduce hyperbolicity in probability discounting as it does in time discounting. Our results demonstrated that psychophysical time for delayed and uncertain rewards may explain the aforementioned anomalies in intertemporal and probabilistic choice, respectively.
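The time-based account above can be stated compactly: exponential discounting applied to logarithmic (Weber-Fechner) psychophysical time is algebraically identical to a generalized hyperbolic discount function, since exp(-k·ln(1+aD)/a) = (1+aD)^(-k/a). A short numerical check — the reward size and parameter values are arbitrary illustrations:

```python
import math

def subjective_time(delay, a):
    # Weber-Fechner (logarithmic) psychophysical time
    return math.log(1.0 + a * delay) / a

def value_exp_subjective(amount, delay, k, a):
    # Exponential discounting applied to subjective, not objective, time
    return amount * math.exp(-k * subjective_time(delay, a))

def value_general_hyperbola(amount, delay, k, a):
    # Equivalent closed form: a generalized hyperbolic discount function
    return amount / (1.0 + a * delay) ** (k / a)

A, K, a = 100.0, 0.2, 0.5   # arbitrary reward size and parameters
for delay in (0.0, 1.0, 10.0, 100.0):
    v_exp = value_exp_subjective(A, delay, K, a)
    v_hyp = value_general_hyperbola(A, delay, K, a)
    print(f"delay {delay:6.1f}: {v_exp:8.3f} == {v_hyp:8.3f}")
```

The same construction applies to probability discounting if delay is replaced by odds against winning, which is the extension the abstract examines.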
P3-2-60
基底核および小脳における時間情報の神経表現
Neuronal representation of temporal information in the basal ganglia and the cerebellum

○國松淳1, 大前彰吾1,2, 田中真樹1
○Jun Kunimatsu1, Shogo Ohmae1,2, Masaki Tanaka1
北大・医・神経生理学1, ペンシルバニア大学2
Department of Physiology, Hokkaido University School of Medicine, Sapporo, Japan1, Department of Psychology, University of Pennsylvania, Philadelphia, USA2

Two subcortical structures, the basal ganglia and the cerebellum, are known to be implicated in temporal processing. It has been hypothesized that the basal ganglia play a role in monitoring elapsed time in the range from a second to minutes, while the cerebellum is essential for processing sub-second intervals. However, this hypothesis remains elusive because it stems from a variety of observations in different species and behavioral paradigms. Furthermore, no previous study has directly compared neuronal activities in the relevant structures using tasks requiring a range of temporal processing. To examine the underlying neuronal mechanisms, we recorded from neurons in the dentate nucleus and the caudate nucleus while monkeys performed a self-timed saccade task. In this task, the animals were required to reproduce three mandatory intervals of 300, 900, or 2100 ms and make a saccade without any immediate external trigger. Many neurons in both structures exhibited a gradual ramp-up of firing rate prior to self-timed saccades. Neurons in the dentate nucleus (n = 66) consistently exhibited gradually increasing activity 400-800 ms prior to saccades, irrespective of the length of the interval to be reported. In contrast, neurons in the caudate nucleus (n = 12) started to modulate their firing with a fixed delay following the cue, but showed different time courses of activity depending on the length of the reported interval. Although neurons in both subcortical structures were generally active during a range of temporal processing, the time courses of neuronal activity suggest that they might contribute differently. Signals in the cerebellum may be used for fine adjustment of movement timing, and therefore might play a greater role in self-timed movements over short periods. Signals in the basal ganglia may keep track of time in the range > 500 ms, and may regulate the timing of decisions for self-initiated movements.
P3-2-61
サルの視覚刺激に対する選好性:行動学的研究
Monkeys' preference for visual items: behavioral study

○中本若奈1, 大西ひかり1, 船橋新太郎1
○Wakana Nakamoto1, Hikari Onishi1, Shintaro Funahashi1
京都大学 こころの未来研究センター1
Kokoro Research Ctr, Kyoto Univ, Kyoto1

We prefer some items but not others, and preference for items differs from person to person. The aim of this experiment was to determine what parameters determine preference for items, to what extent individual differences are present, and what neural mechanisms participate in this process. Since preference for an item could depend on its particular physical properties, in the present study we behaviorally examined the effects of physical properties (color, shape, glossiness, and coarseness) of items on preference using monkeys. We used 50 photographs obtained from the FMD database as stimuli, including photographs of cloth, paper, glass, metal, plastic, rock, and water. We also prepared black-and-white and modified-coarseness versions of these photographs. First, a stimulus and its modified version were simultaneously presented on the monitor, and four monkeys were asked to choose one stimulus by eye movements. Second, 2 stimuli were selected randomly from the 50 stimuli and simultaneously presented on the monitor. The monkeys were asked to choose one stimulus by eye movements and look at the chosen stimulus for up to 6 s. We considered the chosen stimulus as the monkey's preferred stimulus if the monkey kept looking at it for that period. Monkeys exhibited different choice ratios for different stimuli; the choice ratio of each stimulus was therefore considered an index representing the strength of preference for that stimulus. As a result, the clarity of the stimulus, or the strength of its spatial frequency components, was shown to be the most important parameter determining preference for the stimuli used in the present experiment. Color, glossiness, and material had little effect on preference.
P3-2-62
ショウジョウバエ幼虫におけるセロトニン作動性神経回路の方向転換行動パターン制御
Serotonergic neural circuits regulate the pattern of turning behavior in Drosophila larvae

○奥沢暁子1, 高坂洋史2, 能瀬聡直1,2
○Satoko Okusawa1, Hiroshi Kohsaka2, Akinao Nose1,2
東京大院 理・物理1, 東京大院 新領域・複雑理工2
Dept. of Physics, Grad. Sch. of Sci., Univ. of Tokyo, Tokyo, Japan1, Dept. of Complexity Sci. and Eng., Grad. Sch. of Frontier Science, Univ. of Tokyo, Chiba, Japan2

Most animal behaviors arise as a combination of constituent movements. How animals choose an appropriate combination of behavioral components according to external and internal conditions is poorly understood. In this work, we studied the components of turning behavior in Drosophila larvae and identified serotonergic neural circuits that regulate the choice of a component during the behavior.
During locomotion, larvae move forward in a straight line by generating a series of peristaltic movements, but sometimes turn to expand the exploration area or to escape from potential dangers. To classify and quantify the components of turning behavior, we constructed an experimental set-up in which we can efficiently induce the behavior by applying strong blue light stimulation to the head of the larvae. We found that the turning behavior is composed of three components: bending, retreating and rearing. The combination and number of the three components during turning were variable, suggesting that the behavior can be generated by various combinations of the three components. While one or more bends were always included in the combination, rearing and retreating were seen less frequently.
We found that serotonin regulates the pattern of the combination by specifically inhibiting one of the components, rearing. First, blocking serotonergic neurons increased the frequency of rearing without affecting bending and retreating. Second, applying a serotonin agonist reduced the frequency of rearing. Finally, reduction of a serotonin receptor, 5-HT1BDro, increased the frequency of rearing. Furthermore, we identified a class of downstream neurons, which express the serotonergic receptor and regulate the frequency of rearing. These neurons expressed a neuropeptide, leucokinin, as a transmitter. Taken together, we showed that neural circuits that include the serotonin and leucokinin systems in series regulate the choice of components during larval turning behavior.
P3-2-63
リスクを伴う意思決定における島皮質前部の神経表現
Neural representation of risky decision making in rat anterior insular cortex

○石井宏憲1, 高橋勝平1, 大原慎也1, Philippe N. Tobler2, 筒井健一郎1, 飯島敏夫1
○Hironori Ishii1, Shohei Takahashi1, Shinya Ohara1, Philippe N. Tobler2, Ken-Ichiro Tsutsui1, Toshio Iijima1
東北大院・生命・脳情報処理1, チューリッヒ大学・経済学部2
Div. Sys. Neurosci., Tohoku Univ., Sendai1, Dept. Econ., Univ. of Zurich, Zurich2

The anterior insular cortex (AIC) has been considered one of the brain regions that play a central role in the choice of whether to take or avoid risk. Human imaging studies have reported activations of the AIC in various gambling tasks (Paulus et al., 2003; Kuhnen and Knutson, 2005; Preuschoff et al., 2008; Xue et al., 2010). Previously, we showed a causal role of the AIC in risky decision making: inactivation of the AIC decreased rats' risk preference in gambling tasks (Ishii et al., 2012), suggesting that the AIC normally promotes risk-seeking behavior. To test this suggestion, here we recorded multiple single-unit activities in the AIC during performance of the gambling task used in our previous study. The basic task was to obtain water by choosing one of two levers, associated with a risky option (4 drops or no water, 50% each) and a sure option (2 drops of water), respectively. The rats were surgically implanted with eight-wire electrode arrays attached to microdrives in the bilateral AIC. One hundred and five AIC neurons were recorded from three rats. Twenty-seven neurons showed significant changes in firing rate before the choice (during the 1000 ms before the lever press) relative to baseline (the firing rate during nose-poke holding to start a trial). Among them, 5 neurons significantly increased and 4 neurons decreased their firing rate when the rats chose the risky option compared to when they chose the sure option. Meanwhile, no neuron showed a larger increase in firing rate in the sure choice than in the risky choice, and two neurons decreased their firing rate more in the sure choice than in the risky choice. Across the population of all 105 recorded neurons, the rate change from baseline to the choice period was significantly higher in the risky choice than in the sure choice. These results strongly support the idea that the AIC is involved in risk-seeking behavior.
P3-2-64
課題条件に応じた行動選択時のラットの前頭前野と背側線条体の役割
Role of prefrontal and striatal neurons in task-condition-dependent action selection of rat

○船水章大1,2, 伊藤真1, 銅谷賢治1, 神崎亮平3,4, 高橋宏知3,4
○Akihiro Funamizu1,2, Makoto Ito1, Kenji Doya1, Ryohei Kanzaki3,4, Hirokazu Takahashi3,4
沖縄科学技術大学院大学1, 日本学術振興会 特別研究員PD2, 東京大学 先端科学技術研究センター3, 東京大学 情報理工学系研究科4
Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan1, JSPS Research Fellow PD, Tokyo, Japan2, RCAST, University of Tokyo, Tokyo, Japan3, Graduate School of Information Science and Technology, University of Tokyo, Tokyo, Japan4

Despite widespread neural coding of actions, rewards and expected reward values in the cortico-basal ganglia network, the effect of task conditions on this coding during reward-based action selection is still unclear. Here, we conducted a choice task consisting of a random sequence of fixed-reward and variable-reward trials in 5 rats and recorded the activities of 23 prelimbic and 43 striatal neurons.
In the choice task, rats were asked to nose-poke into a left or right hole, and received a reward stochastically. To differentiate the reward condition of each trial, a light stimulus was presented only in fixed-reward trials. While the reward probability was fixed in fixed-reward trials, it was changed among 4 settings in variable-reward trials whenever the choice frequency of the more rewarding hole reached 80%.
In variable-reward or fixed-reward trials, rats explored or constantly selected the more rewarding hole, respectively, indicating that they distinguished the trial conditions. The choice sequences were analyzed with reinforcement learning models; an interactive value-updating model, in which the action values in variable-reward trials were updated with the experiences in both variable- and fixed-reward trials, fit the behavior better than conventional models that update the values of each condition independently (paired t-test, p = 4.46E-30).
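The contrast between the interactive and conventional (independent) value-updating schemes can be illustrated with a minimal Q-learning sketch. This is our illustration, not the authors' code; the learning rate, rewards, and trial sequence are assumed values for demonstration only.

```python
# Minimal sketch of interactive vs. independent value updating
# (illustrative only; alpha and the trial sequence are assumed).

def update(q, action, reward, alpha=0.2):
    """Delta-rule update of a single action value."""
    q[action] += alpha * (reward - q[action])

# Independent model: one value table per trial condition.
q_fixed = [0.0, 0.0]     # left/right values, fixed-reward trials
q_variable = [0.0, 0.0]  # left/right values, variable-reward trials

# Interactive model: one shared table, so fixed-reward experience
# also updates the values used in variable-reward trials.
q_shared = [0.0, 0.0]

trials = [("fixed", 0, 1.0), ("variable", 0, 0.0), ("fixed", 0, 1.0)]
for condition, action, reward in trials:
    update(q_shared, action, reward)  # interactive: update on every trial
    if condition == "fixed":
        update(q_fixed, action, reward)
    else:
        update(q_variable, action, reward)

# The shared table blends experience from both conditions, whereas the
# independent tables reflect only their own condition's outcomes.
```

Model comparison would then proceed by computing choice likelihoods from each model's values and comparing fits across rats.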
Regression analyses of neural activities showed that, during reward/no-reward cues, 60.9% of prelimbic neurons changed their activities depending on both trial conditions and rewards, while only 11.6% of striatal neurons did (χ2-test, p = 2.55E-5). After the reward/no-reward cues, 43.5% of prelimbic neurons showed activities dependent on both actions and rewards, while a significantly larger proportion of striatal neurons (81.4%) did (χ2-test, p = 0.00163). These results suggest that condition-reward and action-reward associations are represented in the prelimbic cortex and striatum, respectively.
P3-2-65
サル視床下部外側野ニューロンの神経活動は嫌悪刺激の導入により抑制される
Inclusion of an aversive option in the task attenuates responses to the conditioned and unconditioned stimuli in the primate lateral hypothalamus area

○則武厚1, 中村加枝1,2
○Atsushi Noritake1, Kae Nakamura1,2
関西医科大学・第二生理1, JSTさきがけ2
Physiol., Kansai Med. Univ., Osaka, Japan1, PRESTO, Saitama, Japan2

The value of a reward may change depending on circumstances. For example, if a punishment is included as a possible outcome, the same reward may be valued higher relative to the punishment. On the other hand, the reward may be devalued under the risk of danger. We have previously shown that neurons in the lateral hypothalamic area (LHA) convey information mainly about reward probability, uncertainty, and predicted values, but not about punishments. In that study, however, because rewards and punishments were tested in separate blocks of trials, it remained unclear how the coding of rewards is influenced when both positive and negative outcomes may occur. We therefore analyzed LHA neuronal activity in two monkeys conditioned with a Pavlovian trace procedure. In the task, we manipulated the contingency between conditioned stimulus (CS) and unconditioned stimulus (US) in three distinct contexts: the appetitive (APP), aversive (AVE), and bivalent (BIV) blocks. The US was either a liquid reward or a tone in the APP block, an air-puff or a tone in the AVE block, and a liquid reward or an air-puff in the BIV block. In each block, there were cued and uncued trials. On cued trials, an alerting cue was presented, followed by one of three visual CSs, each of which signaled the probability of the US (0, 50, or 100%). After the CS disappeared, there was a 1.0 s delay followed by the US. On uncued trials, the US was delivered unpredictably. As reported previously, among 246 task-related neurons, many neurons showed graded responses to the CS (n=114) or US (n=116) depending on outcome probability or predictability in the APP and BIV blocks. Moreover, the responses to the CS or US were significantly more attenuated in the BIV block than in the APP block. Thus, inclusion of the aversive outcome attenuated LHA neurons' responses to reward. These results indicate that value representation in the LHA is modulated by context, such as the possible outcome options.
P3-2-66
過学習時においてサル中脳ドーパミン細胞は超長期的な報酬期待を表現する
Midbrain dopamine neuron signals supra-long term, minimum-discounted future rewards in over-trained monkeys

○榎本一紀1, 松本直幸2, 木村實1
○Kazuki Enomoto1, Naoyuki Matsumoto2, Minoru Kimura1
玉川大学 脳科学研究所1, 熊本県立大学 環境共生学部2
Brain Sci Inst, Tamagawa Univ, Tokyo1, Fac of Environmental and Symbiotic Sci, Pref Univ of Kumamoto, Kumamoto2

In a familiar and stable environment, it is economical to process events indifferently, with long-term prediction of future actions and rewards. Midbrain dopamine (DA) neurons have been proposed to play critical roles in learning by encoding reward value and its prediction error. We previously reported that DA neurons learn to encode the long-term value of multiple future rewards in monkeys. However, it is unclear how DA neurons signal when monkeys are over-trained and can predict rewards not only over individual steps but also over blocks of multiple steps. In this study, monkeys learned a choice task that consisted of sub-blocks of multiple rewarding steps. We studied DA activity after the monkeys had learned the task for 2-3 months. In the advanced stage of learning, the duration of anticipatory licking at the spout of the reward pipe, an index of reward expectation, differentiated each step in a manner consistent with the long-term reward value estimated by the TD learning algorithm. When the monkeys had learned the steps and blocks of choices for more than 50 days, however, licking became indifferent to each step. Consistently, DA responses to the task-start cue evolved to reduce the differences between steps: the response magnitude in the step with the lowest reward probability was not significantly different from that in the step with the highest probability. We observed the same tendencies in DA responses to CSs in a classical conditioning paradigm in which reward probability increased step by step and was thus predictable. The responses were still modulated by motivational state, estimated by the reaction time to start trials. Moreover, DA responses to reinforcers (action outcome or US) represented reward prediction errors faithfully. These results suggest a novel role of DA in a well-learned environment: signaling event-indifferent, supra-long-term (minimally discounted) reward information that facilitates decisions over longer time scales, such as minutes, hours, days, or even more.
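A toy computation shows why near-undiscounted (minimally discounted) valuation makes the steps indifferent. The reward probabilities, discount factors, and repeating-block structure below are our assumptions for illustration, not the authors' task parameters.

```python
# Toy illustration (assumed parameters): the value of each step in a block
# of rewarding steps that repeats indefinitely, under exponential
# discounting. As gamma approaches 1, the values of all steps converge,
# mirroring step-indifferent reward expectation after over-training.

def cyclic_step_values(rewards, gamma, horizon=10000):
    """Discounted value of each step in an endlessly repeating block."""
    n = len(rewards)
    return [sum(gamma ** k * rewards[(i + k) % n] for k in range(horizon))
            for i in range(n)]

def relative_spread(values):
    """How different the largest and smallest step values are."""
    return (max(values) - min(values)) / min(values)

rewards = [0.25, 0.5, 1.0]  # reward probability rising step by step
myopic = cyclic_step_values(rewards, gamma=0.5)    # strong discounting
patient = cyclic_step_values(rewards, gamma=0.99)  # minimal discounting
# relative_spread(myopic) is large; relative_spread(patient) is under 1%,
# i.e., the steps have become nearly indistinguishable in value.
```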
P3-2-67
複数の特徴次元に由来する視覚的顕著性信号の計算様式
Neuronal calculation for the singleton-target saliency defined by multiple feature dimensions during visual search of macaque monkeys

○大渕藍1, 田中智洋1, 小川正1
○Ai Obuchi1, Tomohiro Tanaka1, Tadashi Ogawa1
京都大学大学院 医学研究科 認知行動脳科学1
Dept of Integrative Brain Science,Grad Sch of Med, Kyoto Univ1

A "singleton" object that differs from its surroundings in basic stimulus features is intrinsically conspicuous (salient). A singleton target is generally discriminated with a shorter reaction time when it differs in multiple features (e.g., a red circle among green crosses) than in a single feature (e.g., a red circle among red crosses). However, it is still unclear how the neural signals from different features are processed and contribute to speeding up target discrimination in visual search. To provide insight into the underlying neural mechanisms, we trained monkeys to search for a singleton target in a search array and measured the reaction time of the saccade made to that target. The target was unique from the distractors in shape and/or color. The first purpose of the present study was to test whether the speed-up of target discrimination could be explained by a race model (shape and color feature signals race independently) or by an integration model (the two feature signals are combined). The second purpose was to evaluate the strength of the neural signal for the salient target by analyzing the distributions of saccadic reaction times with the LATER model. We found that the race model cannot explain the reduction in saccadic reaction time for the multiple-feature singleton target, suggesting that integration of the multiple feature signals may occur at the target-selection and/or upstream stages. However, a secondary analysis with the LATER model revealed that the strength of the saliency signal for the multiple-feature singleton was only 60% of the sum of those for the single-feature singletons (a rate of 50% would indicate no enhancement after integration). Thus, the present results demonstrate that the saliency signals for a singleton target in each feature dimension can be enhanced by their integration, but this integration is not perfect; rather, it is incomplete.
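The two models can be contrasted with a small Monte Carlo sketch under LATER-style assumptions. The rate means, variability, and threshold below are illustrative values we chose, not fitted parameters.

```python
# Monte Carlo sketch (assumed parameters) of LATER-style reaction times:
# a decision signal rises at a normally distributed rate r to a fixed
# threshold, so RT = threshold / r. The race model takes the faster of
# two independent single-feature processes; the integration model sums
# the two feature signals, here at only 60% of the full sum, echoing the
# incomplete integration described above.
import random

random.seed(0)
THRESHOLD = 1.0

def later_rt(mean_rate, sd_rate=1.0):
    rate = max(random.gauss(mean_rate, sd_rate), 0.5)  # clip tiny rates
    return THRESHOLD / rate

def mean(xs):
    return sum(xs) / len(xs)

n = 20000
shape = [later_rt(5.0) for _ in range(n)]  # single feature: shape
color = [later_rt(5.0) for _ in range(n)]  # single feature: color
race = [min(s, c) for s, c in zip(shape, color)]              # race model
integrated = [later_rt(0.6 * (5.0 + 5.0)) for _ in range(n)]  # 60% of sum

# Even at 60% of the summed signal strength, integration yields shorter
# mean RTs than the race of independent feature signals.
```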
P3-2-68
中脳ドーパミン細胞の報酬予測誤差信号に対するコストの影響
The effect of cost on the reward prediction error signal in midbrain dopamine neurons

○田中慎吾1, 坂上雅道1
○Shingo Tanaka1, John P O'Doherty2, Masamichi Sakagami1
玉川大学 脳科学研究所1, カリフォルニア工科大学2
Brain Science Institute, Tamagawa University, Tokyo1, California Institute of Technology, USA2

"One of the greatest joys in life is an ice cold beer after a hard day of work." This phrase indicates that the value of a reward can be modulated by the effort or cost put into its attainment. Some studies have shown that the values of options, which integrate the related costs and rewards, are represented in the PFC. However, few studies have addressed the neural basis of the effect of cost on the value of the reward. Additionally, it is still unclear how cost information is processed in the basal ganglia to calculate the values of options and rewards.
When the values of options are calculated in the basal ganglia, the activity of dopamine neurons, which represents the reward prediction error, is used to update those values. If a cost signal is transmitted to dopamine neurons, then it is possible that cost affects their activity. Here, we examined whether the activity of dopamine neurons was modulated by cost.
Two monkeys performed a saccade task. After fixating a central fixation point, the subjects were required to make a saccade to a condition cue, after which a target appeared. In the high-cost condition, longer fixation on the target was required. After fixation on the target, the subjects made a saccade to the reward cue.
The subjects preferred the low-cost condition to the high-cost condition. In contrast to this preference, they preferred reward cues presented after the high-cost condition to those presented after the low-cost condition. While the subjects performed the saccade task, the activity of dopamine neurons was recorded from the SNc in the midbrain. The dopamine neurons showed phasic responses to the condition cues and the reward cues. The response to the low-cost cue was larger than that to the high-cost cue. Conversely, the response to the reward cue after the high-cost condition was larger than that after the low-cost condition. From these results, we suggest that information about cost is transmitted to dopamine neurons for value calculation.
P3-2-69
行動選択におけるラット前頭前野の多細胞集団活動の挙動
Neural ensemble dynamics in rat dorsomedial prefrontal cortex during a sensory-guided choice task

○半田高史1, 竹川高志1, 春国梨恵1, 礒村宜和2, 深井朋樹1
○Takashi Handa1, Takashi Takekawa1, Rie Harukuni1, Yoshikazu Isomura2, Tomoki Fukai1
理研・脳センター・脳回路機能理論研究チーム1, 玉川大・脳科学研究所2
Neural Circuit Theory, RIKEN BSI, Saitama1, Brain Science Instit, Tamagawa Univ, Tokyo2

Rodent dorsomedial prefrontal cortex (dmPFC), also called secondary motor cortex, is a candidate region for the spatial processing underlying action decisions based on sensory information. In particular, it remains unclear how single neurons and neural ensembles in dmPFC contribute to directing action according to the perception of familiar as well as unfamiliar (novel) sensory cues. We recorded ensemble activity from the dmPFC of head-restrained rats performing a choice task in which they licked one of two spatially distinct alternatives according to familiar and novel sensory cues. We obtained 283 putative regular-spiking (RS) neurons and 41 putative fast-spiking interneurons (FS) in total, of which 200 neurons (165 RS, 35 FS) exhibited event-related increases in firing rate. Single-neuron activity was modulated by cue presentation and/or response execution. About half of these neurons (RS: 86/165, FS: 21/35) responded specifically to the upcoming choice. In a subset of the choice-selective neurons, activity was also modulated by the cue tones. We found concomitant activation with choice specificity and cue-tone modulation from cue onset until after choice execution. Ensemble firing patterns of simultaneously recorded dmPFC populations revealed choice specificity with cue-tone modulation when a correct choice was made under the familiar cue-tone condition. In error choices, the patterns showed similar choice specificity but weaker cue-tone modulation. Under the novel cue-tone condition, the firing patterns likewise showed choice specificity similar to that under the familiar condition, but cue-tone modulation was weaker than in correct choices under the familiar condition. These results suggest that dmPFC neurons process the forthcoming choice and can be modulated by perceptual states at the single-neuron and population levels.
P3-2-70
意思決定における報酬と痛みの主観的価値の統合に関するfMRI研究
Integration of appetitive and aversive values in decision making: An fMRI study

○丸山雅紀1, 吉田和子1, 石井信1,2
○Masaki Maruyama1, Wako Yoshida1, Shin Ishii1,2, Ben Seymour3,4
株式会社 国際電気通信基礎技術研究所 脳情報解析研究所1, 京都大学 情報学研究科 システム科学専攻 論理生命学分野 システム情報論講座2, 独立行政法人 情報通信研究機構 脳情報通信融合研究センター3
Advanced Telecommunications Research Institute International, Neural Information Laboratories, Kyoto, Japan1, Integrated Systems Biology Laboratory, Department of Systems Science, Graduate School of Informatics, Kyoto University, Kyoto, Japan2, Center for Information and Neural Networks, National Institute for Information and Communications Technology, Osaka, Japan3, Computational and Biological Learning Laboratory, Department of Engineering, University of Cambridge, UK.4

In daily life, choices that lead to personal gain often entail some sort of cost. Behavioral studies have provided a wealth of data indicating that positive (gain) and negative (cost) values are integrated into a combined subjective value that can guide decisions. However, the neural systems that subserve such integration are not well understood. The current study used fMRI in healthy subjects to investigate how the brain integrates values of opposing valence, using monetary reward as gain and painful shock as cost. Each trial started with an offer consisting of a pair of visual cues indicating a subjective pain intensity and an amount of monetary reward. The subjects were instructed to press one of two buttons immediately after deciding to accept or reject the offer. In one condition, when the offer was accepted, they received both the painful shock and the money with fifty percent probability. In a second condition, they received either the painful shock or the money. Comparison of the two conditions allowed evaluation of the interaction between decision outcomes. The painful shock was delivered at the end of the trial, and the total accumulated reward money was paid after the experiment. The behavioral data revealed an integrative decision process: reaction times were prolonged as the subjective value of the offer approached zero (i.e., greater decision conflict). This result supports an accumulation of integrated value, as expected from a psychological diffusion model, which we examine in detail. fMRI data will be presented that support the corresponding neural basis of this decision process.
P3-2-71
意識に昇らない視覚刺激による連合学習
Classical conditioning reinforced by visual stimuli without awareness

○高桑徳宏1, 加藤利佳子1, 伊佐正1,2
○Norihiro Takakuwa1, Rikako Kato1, Peter Redgrave3, Tadashi Isa1,2
自然科学研究機構 生理学研究所 認知行動発達機構研究部門1, 総合研究大学院大学2, 英国シェフィールド大学3
Dept Dev. Physiol, Nat'l Inst. Physiol. Sci., Okazaki1, The Graduate Univ for Advanced Studies, Hayama2, Dept Psychol, Univ of Sheffield, Sheffield, United Kingdom3

Animals can learn associations between predictive stimuli and reward. This process is called classical, or Pavlovian, conditioning. After damage to the primary visual cortex (V1), human patients lose visual awareness. However, some patients can make saccadic eye movements and/or manual responses to visual stimuli presented in their affected visual field. This phenomenon is called blindsight. Here we asked whether classical conditioning can proceed without subjective awareness. For this purpose, we investigated whether a visual cue presented without awareness can act as a conditioned stimulus (CS) in a classical conditioning task. We used two monkeys with unilateral lesions of the primary visual cortex as an animal model of 'blindsight', and asked whether they could learn associations between CSs presented in the affected visual field and reward. During the learning task, the monkeys fixated a point in the center of the screen, and two visual CSs were presented in the affected visual field. One predicted an immediate large reward (CS+ trial), while the other predicted a small reward after a long interval (CS- trial). The cues could be discriminated by their position on the screen. The reward timing differed between CS+ and CS- trials: 1.3 s and 1.5 s after cue onset, respectively. We counted the amount of anticipatory licking to assess whether the monkeys learned the association between the visual CSs and reward. During learning, the monkeys increased anticipatory licking triggered by the visual CSs. Furthermore, the timing of anticipatory licking differed between CS+ and CS- trials. After learning, the positions of the visual cues in CS+ and CS- trials were changed. Again, the monkeys were able to relearn the association between the new positions of the CSs in the affected visual field and reward. These results suggest that the monkeys were able to learn the predictive value of visual stimuli in a classical conditioning task in the absence of subjective awareness.
P3-2-72
サルの尾状核における嫌悪刺激に関わる情報の表現様式
Representation of the aversive information in the primate caudate

○上田康雅1, 時田賢一1, 中村加枝1
○Yasumasa Ueda1, Kenichi Tokita1, Kae Nakamura1
関西医科大学 第二生理1
Dept Physiol, Kansai Medical University, Osaka1

Studies on the neuronal mechanisms of decision making have emphasized reward- and punishment-dependent learning as critical in value-driven decision making. Neurons in the striatum, an input channel of the basal ganglia, have repeatedly been shown to exhibit reward-dependent modulation of activity. However, it is not well understood whether and how the striatum is also involved in aversive information processing. To answer this question, we recorded single-neuron activity in the caudate nucleus while a monkey (Macaca fascicularis) performed a choice saccade task that required both acquisition of rewards and avoidance of punishments. In the task, three fractal objects were separately associated with a rewarding drop of juice, a neutral tone, or an aversive air-puff. After fixation on the central fixation point for 1 s, two of the three objects were presented simultaneously as a pair (i.e., reward vs. tone, reward vs. air-puff, or tone vs. air-puff), one on the left and one on the right. The animal was then required to choose one of the two objects by making a saccadic eye movement.
We found that the activity of caudate neurons was indeed influenced by the expectation or receipt of the aversive stimulus. Out of 74 task-related neurons, some showed an enhancement of post-cue or peri-saccadic activity (n=56) or of post-outcome activity (n=19) whose magnitude was modulated by the other member of the pair, even for the same direction of saccade to obtain the same reward (i.e., choice of reward paired with tone vs. paired with air-puff). Another group of neurons exhibited an enhancement of post-cue (n=4) or post-outcome (n=3) activity when the aversive object was not chosen (avoided), regardless of the outcome (i.e., choice of reward or tone vs. air-puff). These results suggest that aversive information, integrated with reward information, is processed in the striatum, which may be a neuronal substrate of value-based decision making.
P3-2-73
模倣したさの神経基盤-模倣したさと親密さの密接な関係-
Neural bases of the urge to imitate- Close association between Urge and Familiarity-

○塙杉子1, 杉浦元亮1,2, 野澤孝之3, 蓬田幸人4,5, 事崎由佳3, 秋元頼孝1, 横山諒一1,5, 川島隆太1,3
○Sugiko Hanawa1, Motoaki Sugiura1,2, Takayuki Nozawa3, Yukihito Yomogida4,5, Yuka Kotozaki3, Yoritaka Akimoto1, Ryoichi Yokoyama1,5, Ryuta Kawashima1,3
東北大学・加齢研・脳機能開発1, 東北大学 災害科学国際研究所 災害情報認知研究分野2, 東北大学 加齢医学研究所 スマート・エイジング国際共同研究センター3, 玉川大学 脳科学研究所4, 日本学術振興会5
Dept Functional Brain Imaging, IDAC, Tohoku University, Sendai, Japan1, Dept Disaster-Related Cognitive Science, International Research Institute of Disaster Science2, Smart Ageing International Research Center, IDAC, Tohoku University, Sendai, Japan3, Brain Science Institute, Tamagawa University, Tokyo, Japan4, Japan Society for the Promotion of Science (JSPS)5

Imitation is an inherent human ability. Although many studies have focused on human imitation skills, little research has been carried out on the neural mechanisms of spontaneous imitation. Using functional magnetic resonance imaging (fMRI), we investigated the neural bases of "the urge to imitate", which is closely linked to spontaneous imitation. For our study, we prepared about 200 movie clips of different meaningless hand actions. We first created an inventory for the degree of the urge to imitate and identified confounding factors based on a preparatory experiment evaluating impressions of the movie clips. In addition to the urge to imitate, three confounding factors were identified: "familiarity", "difficulty" (to execute), and "rhythm". We selected 24 movie clips so that the degree of the urge to imitate varied. We presented the movie clips to the subjects, who observed and imitated the hand movements during MRI scanning. We searched for cortical regions where the amplitude of the neural response correlated with the degree of the urge to imitate or with the confounding factors. We found that the right cingulate motor area (CMAr) showed a significant correlation with the urge to imitate only during the imitation condition (p<0.001, corrected to p<0.05 using cluster size; t=4.02 at peak voxel). Our findings indicate that the CMAr is crucially involved in the neural bases of the urge to imitate. Furthermore, there were extensive overlapping areas between the urge and familiarity. This overlap suggests a close association of familiarity with the urge to imitate, even for meaningless actions. Consistent with this, it has been argued that experience may explain better imitation performance for meaningful gestures than for meaningless ones (Rumiati and Tessari, 2002; Vogt et al., 2007).
P3-2-74
報酬に基づく行動のバイアスは視床CM核―線条体投射によって最適化される
Optimization of reward-based response bias through the CM-striatum projection

○山中航1, 木村實1
○Ko Yamanaka1, Minoru Kimura1
玉川大・脳研1
Brain Sci Inst, Tamagawa Univ, Tokyo1

Humans and animals predict future events and reinforce a particular action over others (response bias), while optimizing (either counteracting or facilitating) that bias on the basis of bottom-up signals, for example when unexpected events occur. The centromedian (CM) nucleus of the thalamus may participate in attention shifts and in counteracting response bias through the basal ganglia-thalamo-striatal "internal" loop (Kimura et al., 2004).
We identified 3 classes of neurons in the CM nucleus of 2 Japanese monkeys: those responsive to visual and auditory stimuli at a short latency (SLF, n=6), those with sensory responses at a long latency (LLF, n=30), and those non-responsive to external stimuli (NS, n=80). We have previously reported on SLF and NS neurons. Here, we show characteristic response properties of LLF neurons. Monkeys performed a task of instructed choices between a button press followed by a large reward and one followed by a small reward. Monkeys could bias toward either of the two options before the instruction appeared, because action-reward combinations were fixed for a block of 60-80 trials. LLF neurons as a whole responded robustly to action-reward instructions. Remarkably, they exhibited large responses to the instruction of the small-reward option when the level of response bias was high, whereas they showed large responses to the instruction of the large-reward option under low bias.
We then examined electrophysiologically which of the 3 classes of CM neurons project to the striatum, by stimulating the putamen and recording in CM. We found that 4/13 (31%) of LLF neurons and 0/9 NS neurons projected to the putamen.
These results support a hypothesis that signals of LLF neurons transmitted from CM thalamus to the putamen may optimize reward-based response bias when unexpected events occur.
P3-2-75
Distinctive neural representations induce respective behavior and dopaminergic activity of sign- and goal-tracking rats
○Sivaramakrishnan Kaveri1,2, Hiroyuki Nakahara1
Integrated Theor. Neurosci. Lab., RIKEN BSI1, Dept of Computat. Intelligence and Syst. Sci., Tokyo Inst. of Technol.,2

Phasic dopamine (DA) activity is a major modulator of motivated and reward-oriented behavior, but its functional role is widely debated. One view assigns DA activity the role of a reinforcement learning (RL) signal, the reward prediction error (RPE), whereas another view argues for a role in Pavlovian incentives, such that DA activity imbues the conditioned stimulus (CS) with control over behavior. Recently, Flagel et al. (2011) presented evidence apparently supporting the second view: their sign-tracking (ST) and goal-tracking (GT) rats had differential DA activity that was dominant at the CS and the unconditioned stimulus (US) when approaching the CS and US, respectively, whereas, they argued, under the RL view both types of rats should show no RPE, and hence no DA activity, at the US. However, here we present a novel "dual-representation" hypothesis, using RL model simulations to demonstrate that DA activity represents RPE in both types of rats. Our RL models were formulated with two representation systems that underlie value computation and are associated with different conditioned responses, toward the CS and US, respectively. The value is learned with RPE, but the two systems inevitably compete for the relative assignment of value. When we assumed that the strength of each system varies across individuals, the simulation results indicated that this competition leads to the emergence of differential CS and US approaches, with corresponding ST-like and GT-like DA activity, respectively. Our RL model distinguishes reward prediction from response generation and shows that incentive attribution, seemingly distinct between ST and GT rats, is rather a product of distinct neural representations underlying reward prediction. These results suggest a critical role of neural representations in conditioned behaviors and offer a resolution of the aforementioned two views of DA function.
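The core of the dual-representation idea can be sketched in a few lines. This is our illustration, not the authors' simulation; the learning rate and the strength parameter omega are assumed values.

```python
# Sketch of the dual-representation hypothesis (illustrative parameters):
# a single shared reward prediction error (RPE) trains two competing
# representation systems, one tied to the CS and one to the US. The
# assumed individual trait omega sets the CS system's relative strength,
# yielding sign-tracker-like (CS-dominated) or goal-tracker-like
# (US-dominated) value profiles from the very same RPE learning rule.

def train(omega_cs, reward=1.0, alpha=0.1, n_trials=200):
    v_cs, v_us = 0.0, 0.0
    for _ in range(n_trials):
        rpe = reward - (v_cs + v_us)            # one shared prediction error
        v_cs += alpha * omega_cs * rpe          # CS system claims its share
        v_us += alpha * (1.0 - omega_cs) * rpe  # US system claims the rest
    return v_cs, v_us

st_cs, st_us = train(omega_cs=0.9)  # sign-tracker-like individual
gt_cs, gt_us = train(omega_cs=0.1)  # goal-tracker-like individual
# Total value converges to the reward in both cases; only its split
# between the CS and US systems differs across individuals.
```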
P3-2-77
Neural circuits in controlling paternal parental behavior in male ICR mice following communicative interaction with maternal mate
○Shirin Akther1, Chiharu Higashida1, Azam AKM Fakhrul1, Mingkun Liang1, Jing Zhong1, Haruhiro Higashida1
Kanazawa University1

Appropriate parental care by the father can greatly facilitate healthy family life. Fathers play a substantial role in infant care in a small but significant number of mammalian species, including humans. However, the neural circuitry controlling paternal behavior is much less understood than its maternal counterpart. To support the wellbeing of the parent-infant relationship, the neuromolecular mechanisms of paternal behavior should be clarified. Laboratory (ICR strain) mice are very active in reproduction (Jin et al., Nature, 2007) but are not monogamous. ICR males are not spontaneously parental, but maternal-like parental care (retrieval of pups) can be induced when they are separated from their pups, by olfactory and auditory signals from the mother (Liu et al., Nature Communications, 2013). Here we studied the neuronal circuits that are important for paternal care. To characterize the brain areas activated by paternal care, ICR wild-type male or female mice in male-female pairs were given a bilateral electrolytic brain lesion, a useful tool for disrupting maternal care, in the medial preoptic area (MPOA) or the ventral pallidum (VP). We found that the lesioned males and females showed severe deficits in all components of parental behavior, including retrieval, compared with the no-lesion control and sham control groups. In addition, we observed higher levels of aromatase expression in these brain regions of sires following communicative interaction with their maternal mates. Our results suggest that these areas play a critical role in paternal behavior in male ICR mice. This accords well with previous observations that the MPOA and VP are critical in rat mothers for the expression of maternal behavior and protective voluntary maternal responses.
P3-2-78
Aversive Responses in the Lateral Habenula of the Freely Behaving Mouse
○Roman Boehringer1, Kazue Niisato1, Denis Polygalov1, Hitoshi Okamoto1, Thomas J. McHugh1
RIKEN Brain Science Institute1

The lateral habenula has been suggested to play a role in the 'anti-reward' circuit of the brain. To better understand how this region encodes aversive stimuli in the rodent, we recorded from the lateral habenula (LH) and hippocampus (HPC) of freely behaving mice. Mice were trained to run for a food reward (sucrose pellets) on a circular track with four possible reward locations (NE, SE, SW, NW). In each trial only 2 locations contained reward (NE/SW or SE/NW), while at the non-rewarded locations the mouse received a small aversive air puff, with the locations of the rewards and air puffs pseudo-randomly alternated between trials. In this paradigm we have identified single units in the LH that show a robust increase in firing rate at the locations where the air puffs are delivered. These aversive-stimulus-responsive neurons come in at least two types: a regular-spiking, non-bursting type that exhibits no theta modulation and has a relatively narrow waveform, and a theta-modulated bursting type with a relatively wide waveform. These data are being used to better characterize this circuit in the mouse and to understand its interactions with the spatial representation in the hippocampus.